fix: use tool_responses role for gemma4 models in LiteLLM integration#5655
fix: use tool_responses role for gemma4 models in LiteLLM integration#5655jfrometa88 wants to merge 6 commits into
Conversation
|
Hi @jfrometa88 , Thank you for your contribution! We appreciate you taking the time to submit this pull request. Your PR has been received by the team and is currently under review. We will provide feedback as soon as we have an update to share. |
|
Hi @xuanyang15 , can you please review this. |
| if isinstance(response, str) | ||
| else _safe_json_serialize(response) | ||
| ) | ||
| tool_role = "tool_responses" if "gemma4" in model.lower() else "tool" |
There was a problem hiding this comment.
Should this be more general? or we only need to do this for gemma4?
There was a problem hiding this comment.
I looked into this more carefully before broadening the check.
Gemma models before version4 do not support tool use at all, so they would never reach this code path. The tool_responses role convention is specific to Gemma4's chat template, making "gemma4" the correct and intentional scope for this fix.
Other models with non-standard tool response formats (e.g. Liquid AI's LFM2, which uses its own tokenization scheme with custom delimiters like <|tool_response_start|>) represent a structurally different problem that would require a separate solution — conflating them here would overcomplicate this fix.
Happy to add a comment in the code explaining why the check is scoped to gemma4 specifically, if that would help reviewers in the future.
There was a problem hiding this comment.
I see, thanks for the detailed explanation! Using "gemma4" sounds good to be. Could you please add the comment for future reference?
There was a problem hiding this comment.
Added comments explaining why the check is scoped to gemma4. Ready for another look whenever you get a chance
|
|
||
| healed_messages.append(message) | ||
|
|
||
| # Bloque final también usa expected_tool_role |
There was a problem hiding this comment.
Could you please use english for the comments, consistent with others in this repo?
| return msg.role | ||
|
|
||
|
|
||
| # Tests: single |
There was a problem hiding this comment.
These comments doesn't seem to be very necessary, could you please remove them?
xuanyang15
left a comment
There was a problem hiding this comment.
Thanks for creating the PR! Left some minor coments.
Link to Issue or Description of Change
Closes: #5650
Problem
Gemma4 models served via Ollama do not recognise
role: "tool"in the message history. When tool results were appended using this role, the model entered an infinite loop re-calling the tool instead of generating a final response.Two separate code paths were affected:
_content_to_message_param— builds tool result messages fromfunction_responseparts._ensure_tool_results— heals incomplete histories by injecting placeholder messages when a tool call has no matching result. This function had both the trigger condition and the fallback block hardcoded torole: "tool", so it neither recognised incomingtool_responsesmessages as valid resolutions nor generated the correct role in the placeholder it injected.Solution
Introduced a model-aware role selector in both affected functions. When the model name contains
gemma4, the role is set to"tool_responses"; otherwise it falls back to the standard"tool".In
_content_to_message_param:tool_role = "tool_responses" if "gemma4" in model.lower() else "tool"
In
_ensure_tool_results, a singleexpected_tool_rolevariable is computed once before the loop and reused consistently in three places: the healing trigger condition, theelifbranch that clears resolved call IDs, and the post-loop fallback block. This eliminates the previous dual-condition pattern where the twoifblocks were not mutually exclusive and could both fire on the same message.Testing Plan
Unit Tests:
Added
tests/unittests/test_lite_llm_gemma_tool_role.py.Covers:
gemma4, the generated message usesrole: "tool_responses"instead ofrole: "tool".function_responseparts are present in a single content object.role: "tool"is preserved for all other model names.Full test suite result: 5709 passed, 2300 warnings.
Manual End-to-End (E2E) Tests:
Setup:
Steps:
Before the fix for gemma4 models: history contained both a phantom role: "tool" error message and the real role: "tool_responses" result, causing the model to process the error last and loop.
After the fix: history contains only the real role: "tool_responses" result.
Checklist